Gliomas are the most common primary intracranial tumors, arising from cancerous changes in the glia of the brain and spinal cord, with a high proportion of malignant cases and a significant mortality rate. Quantitative segmentation and grading of gliomas based on Magnetic Resonance Imaging (MRI) is the main method for the diagnosis and treatment of gliomas. To improve the segmentation accuracy and speed for gliomas, a 3D-Ghost Convolutional Neural Network (CNN)-based MRI image segmentation algorithm for gliomas, called 3D-GA-Unet, was proposed. 3D-GA-Unet was built on 3D U-Net (3D U-shaped Network). A 3D-Ghost CNN block was designed to increase useful output features and reduce redundant features of traditional CNNs by using linear operations, and a Coordinate Attention (CA) block was added to obtain more image information favorable to segmentation accuracy. The model was trained and validated on the publicly available glioma dataset BraTS2018. The experimental results show that 3D-GA-Unet achieves average Dice Similarity Coefficients (DSCs) of 0.8632, 0.8473 and 0.8036, and average sensitivities of 0.8676, 0.9492 and 0.8315, for the Whole Tumor (WT), Tumor Core (TC) and Enhanced Tumor (ET) in glioma segmentation. It is verified that 3D-GA-Unet can accurately segment glioma images while further improving segmentation efficiency, which is of positive significance for the clinical diagnosis of gliomas.
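The Ghost-block idea above (generate part of the output channels with an ordinary convolution and the rest with cheap linear operations) can be illustrated with a minimal 2D NumPy sketch; the paper's 3D implementation, kernel sizes, and choice of cheap operation are not restated in the abstract, so a 1x1 primary convolution and a per-channel scaling are used here purely as stand-ins.

```python
import numpy as np

def ghost_block(x, w_primary, w_cheap):
    """Minimal 2D Ghost-block sketch (NCHW input).

    Half the output channels come from an ordinary 1x1 convolution
    ("primary" features); the other half are produced by a cheap
    per-channel linear map on the primary features ("ghost" features).
    """
    # Primary features: 1x1 convolution over the channel axis.
    primary = np.einsum('nchw,oc->nohw', x, w_primary)
    # Ghost features: one cheap linear operation per primary channel.
    ghost = primary * w_cheap[None, :, None, None]
    # Concatenate primary and ghost halves along the channel axis.
    return np.concatenate([primary, ghost], axis=1)
```

Doubling the channel count this way costs only one multiplier per ghost channel instead of a full convolution, which is the source of the claimed reduction in redundant computation.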
Current Video Super-Resolution (VSR) algorithms cannot fully utilize inter-frame information at different distances when processing complex scenes with large motion amplitudes, making it difficult to accurately recover occlusions, boundaries, and multi-detail regions. A VSR model based on frame-straddling optical flow was proposed to solve these problems. Firstly, shallow features of Low-Resolution (LR) frames were extracted through Residual Dense Blocks (RDBs). Then, motion estimation and compensation were performed on video frames using a Spatial Pyramid Network (SPyNet) with straddling optical flows of different time lengths, and deep feature extraction and correction were performed on inter-frame information through RDBs connected in multiple layers. Finally, the shallow and deep features were fused, and High-Resolution (HR) frames were obtained through up-sampling. The experimental results on the REDS4 public dataset show that, compared with the deep Video Super-Resolution network using Dynamic Upsampling Filters without explicit motion compensation (DUF-VSR), the proposed model improves Peak Signal-to-Noise Ratio (PSNR) and Structure Similarity Index Measure (SSIM) by 1.07 dB and 0.06, respectively. The experimental results show that the proposed model can effectively improve the quality of video image reconstruction.
To address the problems of poor performance, easily falling into suboptimal solutions, and inefficiency in neural network hyperparameter optimization, an Improved Real-Coded Genetic Algorithm (IRCGA) based hyperparameter optimization algorithm for neural networks, named IRCGA-DNN (IRCGA for Deep Neural Network), was proposed. Firstly, a real-coded form was used to represent the values of hyperparameters, which made the search space of hyperparameters more flexible. Then, a hierarchical proportional selection operator was introduced to enhance the diversity of the solution set. Finally, improved single-point crossover and mutation operators were designed to explore the hyperparameter space more thoroughly and to improve the efficiency and quality of the optimization, respectively. Two simulation datasets were used to demonstrate IRCGA's performance in damage effectiveness prediction and convergence efficiency. The experimental results on the two datasets indicate that, compared to GA-DNN (Genetic Algorithm for Deep Neural Network), the proposed algorithm reduces the convergence iterations by 8.7% and 13.6% respectively, with little difference in MSE (Mean Square Error); compared to IGA-DNN (Improved Genetic Algorithm for Deep Neural Network), IRCGA-DNN achieves reductions of 22.2% and 13.6% in convergence iterations respectively. Experimental results show that the proposed algorithm is better in both convergence speed and prediction performance, and is suitable for hyperparameter optimization of neural networks.
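The real-coded pipeline described above (real-valued genomes, proportional selection, single-point crossover, mutation) can be sketched as a toy genetic algorithm maximizing a test function; the operator details below (uniform mutation, simple fitness-proportional selection, elitism) are illustrative assumptions, not the paper's hierarchical operators.

```python
import random

def ircga_sketch(fitness, bounds, pop=20, gens=60, pc=0.9, pm=0.1, seed=0):
    """Toy real-coded GA: each individual is a list of real values, one
    per hyperparameter, constrained to the given (low, high) bounds."""
    rng = random.Random(seed)
    dim = len(bounds)
    P = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop)]
    for _ in range(gens):
        scores = [fitness(ind) for ind in P]          # assumed positive
        best = max(range(pop), key=lambda i: scores[i])
        # Fitness-proportional selection: higher fitness, more copies.
        total = sum(scores)
        parents = rng.choices(P, weights=[s / total for s in scores], k=pop)
        children = []
        for i in range(0, pop, 2):
            a, b = parents[i][:], parents[i + 1][:]
            if rng.random() < pc:                     # single-point crossover
                pt = rng.randrange(1, dim) if dim > 1 else 0
                a[pt:], b[pt:] = b[pt:], a[pt:]
            for ind in (a, b):                        # uniform mutation
                for j, (lo, hi) in enumerate(bounds):
                    if rng.random() < pm:
                        ind[j] = rng.uniform(lo, hi)
            children += [a, b]
        children[0] = P[best][:]                      # elitism
        P = children
    return max(P, key=fitness)
```

Encoding hyperparameters as real numbers avoids the discretization that binary coding imposes, which is the flexibility the abstract refers to.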
In the existing Dual LAN (Local Area Network) Terahertz Wireless LAN (Dual-LAN THz WLAN) MAC (Medium Access Control) protocols, some nodes may repeatedly send the same Channel Time Request (CTRq) frame within multiple superframes to apply for time slot resources, and idle time slots exist in some periods of network operation. Therefore, a high-efficiency MAC protocol based on spontaneous data transmission, SDTE-MAC, was proposed. With the SDTE-MAC protocol, each node maintained one or more time-unit linked lists to stay synchronized in running time with the rest of the nodes in the network, so as to know when each node could start sending data frames in a channel idle time slot. The protocol optimized the traditional channel slot allocation and channel remaining slot reallocation processes, improved network throughput and channel slot utilization, reduced data delay, and could further improve the performance of Dual-LAN THz WLAN. The simulation results showed that, when the network was saturated, compared with the new N-CTAP (Normal Channel Time Allocation Period) slot resource allocation mechanism and the adaptive shortening superframe period mechanism in AHT-MAC (Adaptive High Throughput multi-pan MAC protocol), the SDTE-MAC protocol increased the MAC layer throughput by 9.2% and the channel slot utilization by 10.9%, and reduced the data delay by 22.2%.
The existing multi-tampering-type image forgery detection algorithms using noise features often cannot effectively detect the feature differences between tampered and non-tampered areas, especially for the copy-move tampering type. To this end, a dual-stream image tampering forensics network fusing residual feedback and a self-attention mechanism was proposed to detect tampering artifacts such as unnatural edges of RGB pixels and local noise inconsistencies through its two streams respectively. Firstly, in the encoder stage, multiple dual residual units integrating residual feedback were used to extract relevant tampering features and obtain coarse feature maps. Secondly, further feature reinforcement was performed on the coarse feature maps by an improved self-attention mechanism. Thirdly, the mutually corresponding shallow features of the encoder and deep features of the decoder were fused. Finally, the tampering features extracted by the two streams were fused in series, and the pixel-level localization of the tampered area was realized through a special convolution operation. Experimental results show that the F1 score and Area Under Curve (AUC) value of the proposed network on the COVERAGE dataset are better than those of the comparison networks. The F1 score of the proposed network is 9.8 and 7.7 percentage points higher than that of TED-Net (Two-stream Encoder-Decoder Network) on the NIST16 and Columbia datasets, and the AUC increases by 1.1 and 6.5 percentage points, respectively. The proposed network achieves good results in copy-move tampering detection and is also suitable for other tampering types. At the same time, it can accurately locate tampered areas at the pixel level, and its detection performance is superior to the comparison networks.
A visual localization and mapping system is affected by dynamic objects in a dynamic environment, which increases localization and mapping errors and decreases robustness; motion segmentation of input images can significantly improve the performance of such a system in dynamic environments. Dynamic objects in a dynamic environment can be divided into moving objects and potentially moving objects. Current dynamic object recognition methods suffer from confusion of moving subjects and poor real-time performance. Therefore, motion segmentation strategies of visual localization and mapping systems in dynamic environments were reviewed. Firstly, the strategies were divided into three types of methods according to preset conditions of the scene: methods based on the static assumption of the image subject, methods based on prior semantic knowledge, and multi-sensor fusion methods without assumptions. Then, these three types of methods were summarized, and their accuracy and real-time performance were analyzed. Finally, aiming at the difficulty of balancing the accuracy and real-time performance of motion segmentation strategies for visual localization and mapping in dynamic environments, development trends of motion segmentation methods in dynamic environments were discussed and prospected.
Aiming at the problems that existing image deblurring algorithms suffer from diffusion and artifacts when dealing with edge loss, and that full-frame deblurring in video processing does not meet real-time requirements, an Adaptive DeBlurring Generative Adversarial Network (ADBGAN) algorithm based on an active discrimination mechanism was proposed. Firstly, an adaptive blur discrimination mechanism was proposed, and an adaptive blur processing network module was developed to make an a priori judgment of the blurriness of the input image. When collecting the input, the blur degree of the input image was judged in advance, and input frames that were already sufficiently clear were eliminated to improve the running efficiency of the algorithm. Then, an incentive link of the attention mechanism was introduced in the process of fine feature extraction, so that weight normalization was carried out in the forward flow of feature extraction to improve the ability of the network to recover fine-grained features. Finally, the feature pyramid fine feature recovery structure was improved in the generator architecture, and a more lightweight feature fusion process was adopted to improve running efficiency. To verify the effectiveness of the algorithm, detailed comparison experiments were conducted on the open-source datasets GoPro and Kohler. Experimental results on the GoPro dataset show that the visual fidelity of ADBGAN is 2.1 times that of the Scale-Recurrent Network (SRN) algorithm, the Peak Signal-to-Noise Ratio (PSNR) of ADBGAN is improved by 0.762 dB compared with that of SRN, and ADBGAN has good image information recovery ability; in terms of video processing time, the actual processing time is reduced by 85.9% compared to SRN. The proposed algorithm can efficiently generate deblurred images with higher information quality.
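The a-priori blur judgment above is a learned module in ADBGAN; a classic hand-crafted stand-in for the same gating idea is the variance of the Laplacian response, sketched below. The threshold value and the 4-neighbour Laplacian are assumptions for illustration, not the paper's mechanism.

```python
import numpy as np

def is_sharp_enough(gray, threshold=100.0):
    """Gate an input frame by a simple sharpness score: the variance of
    a 4-neighbour Laplacian over the image interior. Sharp frames have
    strong edge responses and high variance; blurred frames do not."""
    g = np.asarray(gray, dtype=float)
    lap = (g[1:-1, :-2] + g[1:-1, 2:] + g[:-2, 1:-1] + g[2:, 1:-1]
           - 4.0 * g[1:-1, 1:-1])
    return float(lap.var()) >= threshold
```

Frames passing the gate would be skipped by the deblurring generator, which is the source of the running-time saving the abstract reports.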
With the deepening of population aging, fall detection has become a key issue in the medical and health field. Concerning the low accuracy of fall detection algorithms in complex scenes, an improved fall detection model, PDD-FCOS (PVT DRFPN DIoU-Fully Convolutional One-Stage object detection), was proposed. A Pyramid Vision Transformer (PVT) was introduced into the backbone network of the baseline FCOS algorithm to extract richer semantic information without increasing the amount of computation. In the feature information fusion stage, Double Refinement Feature Pyramid Networks (DRFPN) were inserted to learn the positions and other information of sampling points between feature maps more accurately, and more accurate semantic relationships between feature channels were captured by context information to improve the detection performance. In the training stage, bounding box regression was carried out with the Distance Intersection over Union (DIoU) loss. By optimizing the distance between the center points of the predicted box and the target box, the regression box converged faster and more accurately, which effectively improved the accuracy of the fall detection algorithm. Experimental results show that on the open-source Fall detection Database, the mean Average Precision (mAP) of the proposed model reaches 82.2%, an improvement of 6.4 percentage points over the baseline FCOS algorithm, and the proposed algorithm achieves higher accuracy and better generalization ability than other state-of-the-art fall detection algorithms.
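The DIoU loss mentioned above has a closed form: one minus IoU, plus the squared distance between the two box centers normalized by the squared diagonal of the smallest enclosing box. A minimal single-box version:

```python
def diou_loss(box_p, box_g):
    """DIoU loss for axis-aligned boxes given as (x1, y1, x2, y2):
    1 - IoU + d^2 / c^2, where d is the center distance and c the
    diagonal of the smallest box enclosing both boxes."""
    px1, py1, px2, py2 = box_p
    gx1, gy1, gx2, gy2 = box_g
    # Intersection and union areas.
    ix1, iy1 = max(px1, gx1), max(py1, gy1)
    ix2, iy2 = min(px2, gx2), min(py2, gy2)
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (px2 - px1) * (py2 - py1) + (gx2 - gx1) * (gy2 - gy1) - inter
    iou = inter / union if union > 0 else 0.0
    # Squared distance between the two box centers.
    d2 = ((px1 + px2 - gx1 - gx2) ** 2 + (py1 + py2 - gy1 - gy2) ** 2) / 4.0
    # Squared diagonal of the smallest enclosing box.
    cx1, cy1 = min(px1, gx1), min(py1, gy1)
    cx2, cy2 = max(px2, gx2), max(py2, gy2)
    c2 = (cx2 - cx1) ** 2 + (cy2 - cy1) ** 2
    return 1.0 - iou + (d2 / c2 if c2 > 0 else 0.0)
```

Unlike plain IoU loss, the distance term still provides a gradient when the predicted and ground-truth boxes do not overlap, which is why DIoU regression converges faster.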
To remove Gibbs artifacts in Magnetic Resonance Imaging (MRI), a Self-attention connection UNet based on Self-Distillation training (SD-SacUNet) algorithm was proposed. In order to reduce the semantic gap between the encoding and decoding features at both ends of the skip connections in the UNet framework and to help capture the location information of artifacts, the output features of each down-sampling layer at the UNet encoding end were input to the corresponding self-attention connection module for the calculation of the self-attention mechanism, and then fused with the decoding features to participate in feature reconstruction. Self-distillation training was performed on the network decoding end: by establishing a loss function between the deep and shallow features, the feature information of the deep reconstruction network was used to guide the training of the shallow network, while the entire network was optimized to improve the image reconstruction quality. The performance of the SD-SacUNet algorithm was evaluated on the public MRI dataset CC359, achieving a Peak Signal-to-Noise Ratio (PSNR) of 30.261 dB and a Structure Similarity Index Measure (SSIM) of 0.9179. Compared with GRACNN (Gibbs-Ringing Artifact reduction using Convolutional Neural Network), the proposed algorithm increased PSNR by 0.77 dB and SSIM by 0.0183; compared with SwinIR (Image Restoration using Swin Transformer), it increased PSNR by 0.14 dB and SSIM by 0.0033. Experimental results show that the SD-SacUNet algorithm improves the image reconstruction performance of MRI with Gibbs artifact removal and has potential application value.
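The deep-to-shallow distillation loss described above can be sketched as a mean-squared penalty pulling each shallow decoder feature toward its deeper counterpart. The exact layer pairing and loss form are not given in the abstract, so a plain per-element MSE over paired feature lists is an assumption:

```python
def self_distillation_loss(shallow_feats, deep_feats):
    """Sketch of decoder-side self-distillation: a mean-squared loss
    between paired shallow (student) and deep (teacher) feature maps,
    here represented as lists of lists of floats."""
    total, count = 0.0, 0
    for s_map, d_map in zip(shallow_feats, deep_feats):
        for s, d in zip(s_map, d_map):
            total += (s - d) ** 2
            count += 1
    return total / count
```

In training, gradients through the teacher side would normally be stopped so that the deep features guide the shallow ones and not the reverse.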
Under the influence of complex weather such as typhoon, heavy fog, rain and snow, as well as occlusions and scale changes, the existing ship detection methods have the problems of false detection and missed detection. In order to solve the above complex scene problems, based on YOLOX-S model, a multi-scale ship detection method based on adaptive feature fusion was proposed. Firstly, a feature augmentation module was introduced into the backbone feature extraction network to suppress the interference of complex background noise on ship feature extraction. Then, considering the problem of deep and shallow feature fusion proportion, an adaptive feature fusion module was designed to make full use of deep and shallow features, thereby improving the multi-scale ship detection ability of the model. Finally, in the detection head network, the detection head was decoupled and an adaptive multi-task loss function was introduced to balance classification tasks and regression tasks, thereby improving the multi-scale ship detection robustness of the model. Experimental results show that the detection mean Average Precision (mAP) of the proposed method on the public ship detection datasets SeaShips and McShips is 97.43% and 96.10%, respectively. The detection speed of the proposed method reaches 189 frames per second, which meets the requirements of real-time detection, demonstrating that the proposed method achieves high-precision detection of multi-scale ship targets even in complex scenes.
Most of the existing code obfuscation solutions are limited to a specific programming language or platform, and thus lack generality; moreover, control flow obfuscation and data obfuscation introduce additional overhead. Aiming at the above problems, an identifier obfuscation method based on the Low Level Virtual Machine (LLVM) was proposed. Four identifier obfuscation algorithms were implemented in the method: a random identifier algorithm, an overload induction algorithm, an abnormal identifier algorithm, and a high-frequency word replacement algorithm; at the same time, a new hybrid obfuscation algorithm was designed by combining these algorithms. In the proposed method, firstly, the function names meeting the obfuscation criteria were selected in the intermediate files compiled by the front-ends. Secondly, these function names were processed by the specific obfuscation algorithms. Finally, the obfuscated files were transformed into binary files by the specific compilation back-ends. The identifier obfuscation method based on LLVM is applicable to the languages supported by LLVM and does not affect the normal functions of the program. For different programming languages, the time overhead is within 20% and the space overhead hardly increases. At the same time, the average obfuscation ratio of the program is 77.5%, and theoretical analysis shows that, compared with the single replacement algorithm and the overload algorithm, the proposed hybrid obfuscation algorithm provides stronger concealment. Experimental results show that the proposed method has the characteristics of low performance overhead, strong concealment, and wide applicability.
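The random-identifier idea can be illustrated at source level: replace each selected function name with a meaningless random string while leaving everything else intact. The real method rewrites symbols in LLVM intermediate files rather than source text, so this is a toy analogue; the 12-character name length is an arbitrary assumption.

```python
import random
import re

def randomize_identifiers(source, names, seed=0):
    """Toy version of the random-identifier obfuscation: rename the
    given function names to random lowercase strings and return the
    rewritten source together with the renaming map."""
    rng = random.Random(seed)
    alphabet = 'abcdefghijklmnopqrstuvwxyz'
    mapping = {}
    for name in names:
        new = ''.join(rng.choice(alphabet) for _ in range(12))
        mapping[name] = new
        # \b keeps e.g. 'foo' from matching inside 'foobar'.
        source = re.sub(r'\b%s\b' % re.escape(name), new, source)
    return source, mapping
```

Keeping the renaming map is what makes the transformation reversible for the tool's owner while stripping semantic hints from the binary's symbol names.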
Concerning the large computing demands of vehicle task offloading and the limited computing capacity of local edge servers in the Internet of Vehicles (IoV), a Hierarchical Resource Allocation Mechanism of cooperative mobile edge computing (HRAM) was proposed. In this algorithm, the computing resources of Mobile Edge Computing (MEC) servers were reasonably allocated and effectively utilized with a multi-layer architecture, so that the multi-hop data forwarding delay between different MEC servers was reduced and the delay of task offloading requests was optimized. Firstly, the system model, communication model, decision model, and calculation model of IoV edge computing were built. Next, the Analytic Hierarchy Process (AHP) was used to comprehensively consider multiple factors to determine the target server to which the offloaded task was transferred. Finally, a task routing strategy with dynamic weights was proposed to make use of the communication capabilities of the overall network to shorten the request delay of task offloading. Simulation results show that, compared with the Resource Allocation of Task Offloading in Single-hop (RATOS) algorithm and the Resource Allocation of Task Offloading in Multi-hop (RATOM) algorithm, the HRAM algorithm reduces the request delay of task offloading by 40.16% and 19.01% respectively, and can satisfy the computing needs of more offloaded tasks under the premise of meeting the maximum tolerable delay.
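The AHP step above weighs multiple factors (e.g. server load, distance, capacity) by a pairwise comparison matrix; a standard way to extract the priority weights is the row geometric-mean approximation of the principal eigenvector, sketched below. Which factors HRAM actually compares is not specified in the abstract.

```python
import math

def ahp_weights(pairwise):
    """AHP priority weights from a pairwise comparison matrix, using the
    common row geometric-mean approximation: the weight of criterion i is
    the geometric mean of row i, normalized to sum to 1."""
    n = len(pairwise)
    geo = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo)
    return [g / total for g in geo]
```

For example, a matrix saying "factor A is 3 times as important as factor B" yields weights 0.75 and 0.25; a server-selection score would then be the weighted sum of each candidate's normalized factor values.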
Aiming at universal No-Reference Image Quality Assessment (NR-IQA), a new NR-IQA algorithm based on the saliency deep features of a pseudo reference image was proposed. Firstly, the corresponding pseudo reference image generated from the distorted image by the ConSinGAN model was used as compensation information for the distorted image, making up for a weakness of NR-IQA methods: the lack of real reference information. Secondly, the saliency information of the pseudo reference image was extracted, and the pseudo saliency map and the distorted image were input into a VGG16 network to extract deep features. Finally, the obtained deep features were merged and mapped into a regression network composed of fully connected layers to obtain quality predictions consistent with human vision. Experiments were conducted on four large public image datasets, TID2013, TID2008, CSIQ and LIVE, to prove the effectiveness of the proposed algorithm. The results show that the Spearman Rank-Order Correlation Coefficient (SROCC) of the proposed algorithm on the TID2013 dataset is 5 percentage points higher than that of the H-IQA (Hallucinated-IQA) algorithm and 14 percentage points higher than that of the RankIQA (learning from Rankings for no-reference IQA) algorithm, and the proposed algorithm also performs stably on single distortion types. Experimental results indicate that the proposed algorithm is superior to the existing mainstream Full-Reference Image Quality Assessment (FR-IQA) and NR-IQA algorithms, and is consistent with human subjective perception.
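The SROCC metric used above correlates the rank order of predicted quality scores with the rank order of human opinion scores. For lists without ties it reduces to the classic closed form:

```python
def srocc(x, y):
    """Spearman Rank-Order Correlation Coefficient for two score lists
    with no ties: 1 - 6 * sum(d_i^2) / (n * (n^2 - 1)), where d_i is the
    difference between the ranks of x_i and y_i."""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0] * len(v)
        for rank, i in enumerate(order):
            r[i] = rank + 1
        return r
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))
```

Because only ranks matter, SROCC rewards a monotonic relationship between predictions and human scores rather than exact numeric agreement, which is why it is the standard IQA correlation metric.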
Concerning the problems that traditional access control methods face a single point of failure and fail to provide trusted, secure, and dynamic access management, a new access control model based on blockchain and smart contracts for Wireless Sensor Networks (WSNs) was proposed to address the poor access dynamics and low level of intelligence of existing blockchain-based access control methods. Firstly, a new blockchain-based access control architecture was proposed to reduce the network computing overhead. Secondly, a multi-level smart contract system consisting of an Agent Contract (AC), an Authority Management Contract (AMC) and an Access Control Contract (ACC) was built, thereby realizing trusted and dynamic access management for WSN. Finally, a dynamic access generation algorithm based on a Radial Basis Function (RBF) neural network was adopted and combined with the access policy to generate the credit score threshold of access nodes, realizing intelligent and dynamic access control management for the large number of sensors in WSN. Experimental results verify the availability, security, and effectiveness of the proposed model in WSN secure access control applications.
To strengthen the control and management of local airspace routes, a route discovery method based on trajectory point clustering was proposed. Firstly, for simulation data generated according to the distribution characteristics of real data, a pre-processing module was used to weaken and remove the noise of the trajectory data. Secondly, a route discovery procedure including outlier elimination, trajectory resampling, trajectory point clustering, clustering center correction, and clustering center connection was proposed to extract the routes. Finally, the route extraction result was visualized, and the proposed method was validated on civil aviation data. The experimental results on the simulated data show that the node coverage and length coverage of the proposed method are 99% and 94% respectively, under a noise intensity of 0.1° and a buffer area of 30 km. Compared with the rasterization method, the proposed method has higher accuracy and can extract routes more effectively, achieving the purpose of extracting the common routes of aircraft.
In the field of deep medical image segmentation, TransUNet (merging the merits of Transformers and U-Net) is one of the current advanced segmentation models. However, the local connections between adjacent blocks in its encoder are not considered, and inter-channel information does not interact during the upsampling process of the decoder. To address these problems, a Multi-attention FUsion Network (MFUNet) model was proposed. Firstly, a Feature Fusion Module (FFM) was introduced in the encoder part to enhance the local connections between adjacent blocks in the Transformer and maintain the spatial location relationships of the images themselves. Then, a Double Channel Attention (DCA) module was introduced in the decoder part to fuse the channel information of multi-level features, which enhanced the sensitivity of the model to key information between channels. Finally, the model's constraints on the segmentation results were strengthened by combining cross-entropy loss and Dice loss. Experiments on the Synapse and ACDC public datasets show that MFUNet achieves Dice Similarity Coefficients (DSC) of 81.06% and 90.91%, respectively. Compared with the baseline model TransUNet, MFUNet achieves an 11.5% reduction in Hausdorff Distance (HD) on the Synapse dataset, and improves segmentation accuracy by 1.43 and 3.48 percentage points on the right ventricle and myocardium of the ACDC dataset, respectively. The experimental results show that MFUNet achieves better segmentation results in both the internal filling and edge prediction of medical images, which can help improve the diagnostic efficiency of doctors in clinical practice.
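The combined cross-entropy + Dice objective above is a common segmentation formulation; a minimal binary pixel-wise version is sketched below. The 0.5 balancing weight is an assumption, since the abstract does not restate MFUNet's exact weighting.

```python
import math

def combined_ce_dice_loss(probs, targets, alpha=0.5, eps=1e-6):
    """Binary segmentation loss: alpha * cross-entropy + (1 - alpha) * Dice.
    `probs` are predicted foreground probabilities per pixel, `targets`
    are 0/1 labels; `eps` guards the logarithms and the Dice ratio."""
    n = len(probs)
    # Pixel-averaged binary cross-entropy.
    ce = -sum(t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps)
              for p, t in zip(probs, targets)) / n
    # Soft Dice loss: 1 - 2|P ∩ T| / (|P| + |T|).
    inter = sum(p * t for p, t in zip(probs, targets))
    dice = 1.0 - (2.0 * inter + eps) / (sum(probs) + sum(targets) + eps)
    return alpha * ce + (1.0 - alpha) * dice
```

Cross-entropy supervises every pixel independently, while the Dice term directly optimizes region overlap and counteracts class imbalance; combining them is what tightens both "internal filling" and edge behavior.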
Focusing on the issue that a single point of failure cannot be efficiently handled by the streaming data processing system Flink, a new fault-tolerant system based on incremental state and backup, Flink+, was proposed. Firstly, backup operators and data paths were established in advance. Secondly, the output data in the data flow diagram was cached, with disks used if necessary. Thirdly, task state synchronization was performed during system snapshots. Finally, backup tasks and cached data were used to recover computation in case of system failure. In the system experiments and tests, Flink+ does not significantly increase the fault tolerance overhead during fault-free operation; when dealing with a single point of failure in both single-machine and distributed environments, compared with the Flink system, the proposed system reduces the failure recovery time by 96.98% with single-machine 8-task parallelism and by 88.75% with distributed 16-task parallelism. Experimental results show that using the incremental state and backup methods together can effectively reduce the recovery time from a single point of failure of a stream system and enhance the robustness of the system.
In commercial digital cameras, due to the limitation of Complementary Metal Oxide Semiconductor (CMOS) sensors, there is only one color channel of information for each pixel in the sampled image. Therefore, a Color image DeMosaicking (CDM) algorithm is required to restore full-color images. However, most of the existing Convolutional Neural Network (CNN)-based CDM algorithms cannot achieve satisfactory performance with relatively low computational complexity and a small number of network parameters. To solve this problem, a CDM network based on Inter-channel Correlation and Enhanced Information Distillation (ICEID) was proposed. Firstly, to fully utilize the inter-channel correlation of the color image, an inter-channel guided reconstruction structure was designed to obtain the initial CDM result. Secondly, an Enhanced Information Distillation Module (EIDM), which can effectively extract and refine features from the image with a relatively small number of parameters, was presented to enhance the reconstructed full-color image with high efficiency. Experimental results demonstrate that, compared with many state-of-the-art CDM methods, the proposed algorithm achieves significant improvement in both objective and subjective quality, with relatively low computational complexity and a small number of network parameters.
Attributed graph embedding aims to represent the nodes in an attributed graph as low-dimensional vectors while preserving the topology information and attribute information of the nodes. Many works are related to attributed graph embedding; however, most of the proposed algorithms are supervised or semi-supervised. In practical applications, the number of nodes that need to be labeled is large, which makes these algorithms difficult to apply and consumes considerable manpower and material resources. The above problems were reanalyzed from an unsupervised perspective, and an unsupervised attributed graph embedding algorithm was proposed. Firstly, the topology information and attribute information of the nodes were calculated respectively by using an existing non-attributed graph embedding algorithm and the attributes of the attributed graph. Then, the embedding vectors of the nodes were obtained by using a Graph Convolutional Network (GCN), and both the difference between the embedding vectors and the topology information and the difference between the embedding vectors and the attribute information were minimized. Finally, similar embeddings were obtained for pairs of nodes with similar topology information and attribute information. Compared with the Graph Auto-Encoder (GAE) method, the proposed method improves the node classification accuracy by 1.2 percentage points and 2.4 percentage points on the Cora and Citeseer datasets respectively. Experimental results show that the proposed method can effectively improve the quality of the generated embeddings.
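The GCN step above propagates attribute information along graph edges; the standard Kipf-style layer is a symmetric-normalized neighborhood average followed by a linear map and nonlinearity. A minimal NumPy sketch of one such layer (the full model's layer count and loss terms are not restated here):

```python
import numpy as np

def gcn_layer(adj, feats, weight):
    """One GCN propagation step: ReLU(D^{-1/2} (A + I) D^{-1/2} X W).
    `adj` is the binary adjacency matrix, `feats` the node attribute
    matrix X, and `weight` the learnable projection W."""
    a_hat = adj + np.eye(adj.shape[0])               # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))    # degree normalization
    norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(norm @ feats @ weight, 0.0)    # ReLU
```

Because each layer averages a node's features with its neighbors', nodes with similar topology and attributes naturally end up with similar embeddings, which is the property the unsupervised objective exploits.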
To improve the computational efficiency of stereo matching for foreground disparity estimation tasks, and to address the disadvantage that general networks take the complete binocular image as input with large input redundancy due to the small proportion of the foreground space in the scene, a real-time target stereo matching algorithm based on sparse convolution was proposed. To realize and improve sparse foreground disparity estimation, firstly, the sparse foreground mask and scene semantic features were obtained simultaneously by a segmentation algorithm. Secondly, sparse convolution was used to extract the spatial features of the sparse foreground region, which were fused with the scene semantic features. Then, the fused features were input into the decoding module for disparity regression. Finally, the disparity map was generated, with the foreground ground-truth map used for the loss. The test results on the ApolloScape dataset show that the accuracy and real-time performance of the proposed algorithm are better than those of the state-of-the-art algorithms PSMNet (Pyramid Stereo Matching Network) and GANet (Guided Aggregation Network), with a single run time as low as 60.5 ms. In addition, the proposed algorithm has certain robustness to foreground occlusion, and can be used for real-time depth estimation of targets.
In printing industry production, directly using You Only Look Once version 4 (YOLOv4) to detect printing defect targets has low accuracy and requires a large number of training samples. To solve these problems, a defect target detection method for printed matter based on Siamese-YOLOv4 was proposed. Firstly, a strategy of image segmentation and random parameter change was used to augment the dataset. Then, a Siamese similarity detection network was added to the backbone network, with the Mish activation function introduced into the similarity detection network to calculate the similarity of image blocks; the regions with similarity below a threshold were regarded as defect candidate regions. Finally, the candidate region images were trained to achieve precise positioning and classification of defect targets. Experimental results show that the detection precision of the proposed Siamese-YOLOv4 model is better than those of mainstream target detection models. On the printing defect dataset, the Siamese-YOLOv4 network achieves a detection precision of 98.6% for satellite ink droplet defects, 97.8% for dirty spots, and 93.9% for missing-print defects; its mean Average Precision (mAP) reaches 96.8%, which is 6.5, 6.4, 14.9 and 10.6 percentage points higher than those of the YOLOv4 algorithm, the Faster Region-based Convolutional Neural Network (Faster R-CNN) algorithm, the Single Shot multibox Detector (SSD) algorithm and the EfficientDet algorithm, respectively.
The proposed Siamese-YOLOv4 model has low false positive and miss rates in the defect detection of printed matter, and improves detection precision by calculating the similarity of image blocks through the similarity detection network, proving that the proposed defect detection method can be applied to printing quality inspection and thereby improve the defect detection level of printing enterprises.
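The block-similarity idea above can be sketched with the Mish activation the abstract names, Mish(x) = x * tanh(softplus(x)), followed by a cosine comparison of two feature vectors. The cosine measure is a stand-in for the learned comparison in the real Siamese branch:

```python
import math

def mish(x):
    """Mish activation: x * tanh(softplus(x)), softplus(x) = ln(1 + e^x)."""
    return x * math.tanh(math.log1p(math.exp(x)))

def block_similarity(a, b):
    """Toy similarity between two flattened image blocks: apply Mish
    element-wise, then compare with cosine similarity. Blocks scoring
    below a threshold would become defect candidate regions."""
    fa, fb = [mish(v) for v in a], [mish(v) for v in b]
    dot = sum(x * y for x, y in zip(fa, fb))
    na = math.sqrt(sum(x * x for x in fa))
    nb = math.sqrt(sum(y * y for y in fb))
    return dot / (na * nb) if na and nb else 0.0
```

Unlike ReLU, Mish is smooth and passes small negative values, which is the usual motivation for using it in detection backbones.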
Authorship attribution is the task of determining the author of a given document. However, traditional methods for authorship attribution are target-independent, considering no constraint during the prediction of authorship, which is inconsistent with actual problems. To address this issue, a Target-Dependent method for Authorship Attribution (TDAA) was proposed. Firstly, the product ID corresponding to the user review was chosen as the constraint information. Secondly, Bidirectional Encoder Representations from Transformers (BERT) was used to extract pre-trained review text features to make the text modeling process more universal. Thirdly, a Convolutional Neural Network (CNN) was used to extract the deep features of the text. Finally, two fusion methods were proposed to fuse the two kinds of information. Experimental results on the Amazon Movie_and_TV and CDs_and_Vinyl_5 datasets show that the proposed method can increase the accuracy by 4%-5% compared with the comparison methods.